Model Comparison and Model Evaluation
“All models are wrong, some models are useful” — George Box
But... it is not always easy to distinguish between the two!
inlabru provides some options.
Criteria of fit
There are also some predictive checks for the model:
you can sample from the posterior using generate() and compute predictive quantities.
[,1]
log marginal-likelihood (integration) -161.2275
log marginal-likelihood (Gaussian) -161.6121
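The output above comes from the `mlik` field of an INLA/inlabru fit object, which stores the two approximations of the log marginal likelihood. A minimal sketch (`fit` is a placeholder name for any model fitted with bru()):

```r
# `fit` is a placeholder for a bru() fit; the mlik field holds the
# integration-based and Gaussian approximations of the
# log marginal likelihood:
fit$mlik
```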
DIC (deviance information criterion): [1] 284.3328
WAIC (also known as the Watanabe–Akaike information criterion): [1] 284.4995
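DIC and WAIC must be requested when fitting; they are then stored in the fit object. A minimal sketch (the object and data names are placeholders; `control.compute` is the standard INLA mechanism for requesting these criteria):

```r
library(inlabru)

# Request DIC and WAIC at fitting time (placeholder model/data names):
fit <- bru(
  y ~ Intercept(1) + x(x, model = "linear"),
  data = df, family = "poisson",
  options = list(control.compute = list(dic = TRUE, waic = TRUE))
)

fit$dic$dic    # deviance information criterion
fit$waic$waic  # Watanabe-Akaike information criterion
```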
Remember: Our GLM is defined as: \[
\begin{eqnarray}
\pi(\mathbf{y}|\mathbf{u},\theta) & =\prod_i \pi(y_i|\mathbf{u},\theta)& \ \text{ likelihood}\\
\pi(\mathbf{u}|\theta)& &\ \text{ LGM}\\
\pi(\theta )& &\ \text{ hyperprior}\\
\end{eqnarray}
\] Using inlabru we estimate the posterior distribution \(\pi(\mathbf{u},\theta|\mathbf{y})\).
The posterior predictive distribution for a new data point \(\hat{y}\) is then: \[ \pi(\hat{y}|\mathbf{y}) = \int\pi(\hat{y}|\mathbf{u},\theta)\pi(\mathbf{u},\theta|\mathbf{y})\ d\mathbf{u}\ d\theta \]
This distribution can be used to check the model fit!
NOTE: In general this is NOT computed by inlabru but needs to be approximated
- Use generate() to sample from the posterior: \((\mathbf{u}^*_i,\theta^*_i)\sim\pi(\mathbf{u},\theta|\mathbf{y})\)
- Simulate a new data point \(y^*_i\sim\pi(y|\mathbf{u}^*_i,\theta^*_i)\)
- Use \(y^*_1,\dots, y^*_N\) to approximate the posterior predictive distribution.
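These steps can be sketched as follows for the Poisson example fitted later in these notes (`fit_pois` and `df_pois` come from that example; the number of samples is an arbitrary choice):

```r
# 1. generate() draws (u*, theta*) from the posterior and evaluates
#    the predictor formula for each draw (one column per sample):
lambda_samples <- generate(fit_pois, df_pois,
  formula = ~ exp(Intercept + x),
  n.samples = 1000
)

# 2. Simulate a new data point for each posterior draw:
y_star <- apply(lambda_samples, 2, function(l) rpois(length(l), l))

# 3. Row i of y_star approximates the posterior predictive
#    distribution of observation i:
hist(y_star[1, ])
```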
BUT inlabru automatically computes two quantities that are useful for model checking!
Definition: the conditional predictive ordinate (CPO) \[ \text{cpo}_i = \pi(y^{\text{obs}}_i|\mathbf{y}_{-i}) \]
How to compute:
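INLA-based fits store the CPO values, one per observation, in the `cpo` field of the fitted object. A minimal sketch (`fit` is a placeholder name for a bru() fit):

```r
# One CPO value per observation; `fit` is a placeholder name:
head(fit$cpo$cpo)
```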
[1] 0.20245537 0.36844305 0.29457897 0.05590256 0.01331477 0.39660432
Note: it is also possible to check for possible failures in the computed CPOs.
Definition: the probability integral transform (PIT) \[ \text{pit}_i = \text{Prob}(\hat{y}_i<y_i|\mathbf{y}_{-i}) \]
Linked to leave-one-out cross-validation
\(pit_i\) shows how well the ith data point is predicted by the rest of the data
Very small values indicate a "surprising" observation under the model.
For a well-calibrated model, the PIT values should be approximately uniformly distributed.
How to compute:
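PIT values are stored next to the CPOs in the fitted object; a histogram should look approximately uniform for a well-calibrated model. A minimal sketch (`fit` is a placeholder name for a bru() fit):

```r
# One PIT value per observation; a roughly flat histogram
# indicates good calibration:
head(fit$cpo$pit)
hist(fit$cpo$pit, breaks = 20)
```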
In the literature there are many proposed scores for evaluating predictions. For example:
They all have their strengths and weaknesses, and which one is better depends on the goals of the model.
inlabru does not provide such scores automatically, but they can be computed using simulations from the posterior distribution.
Our model: \[ \begin{eqnarray} y_i|\lambda_i & \sim& \text{Poisson}(\lambda_i),\quad i = 1,\dots,N_{\text{data}}\\ \log(\lambda_i) & =& \eta_i = \beta_0 + \beta_1 x_i \end{eqnarray} \]
Simulate data and fit the model:
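A minimal sketch of this step (the sample size and the true values of \(\beta_0\) and \(\beta_1\) are illustrative assumptions, not taken from the original; the component names `Intercept` and `x` match the generate() calls used below):

```r
library(inlabru)

# Simulate Poisson data (true coefficients are illustrative):
set.seed(1)
N <- 100
x <- rnorm(N)
df_pois <- data.frame(x = x, y = rpois(N, lambda = exp(1 + 0.5 * x)))

# Fit the Poisson GLM with inlabru:
fit_pois <- bru(
  y ~ Intercept(1) + x(x, model = "linear"),
  data = df_pois, family = "poisson"
)
```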
The CRPS score is defined as: \[ \text{S}_{\text{CRPS}}(F_i, y_i) = \sum_{k=0}^\infty\left[\text{Prob}(Y_i\leq k|\mathbf{y})-I(y_i\leq k)\right]^2 \]
Computational algorithm:
1. Simulate \(\lambda^{(j)}\sim p(\lambda|\text{data}),\ j = 1,\dots, N_{\text{samples}}\), using generate() (size \(N\times N_\text{samples}\)).
2. For each \(i=1,\dots,N_{\text{data}}\), estimate \(r_{ik}=\text{Prob}(Y\leq k|\text{data})-I(y_i\leq k)\) as \[ \hat{r}_{ik} = \frac{1}{N_\text{samples}} \sum_{j=1}^{N_\text{samples}} \{ \text{Prob}(Y\leq k|\lambda^{(j)}_i)-I(y_i\leq k) \} . \]
3. Compute \[ S_\text{CRPS}(F_i,y_i) = \sum_{k=0}^{K} \hat{r}_{ik}^2 \]
Implementation:

```r
library(dplyr)

# Truncation point K: some large value, so that 1 - F(K) is small
max_K <- ceiling(max(df_pois$y) + 4 * sqrt(max(df_pois$y)))
k <- seq(0, max_K)
kk <- rep(k, times = length(df_pois$y))
i <- seq_along(df_pois$y)

# Posterior samples of the predictive CDF evaluated at k = 0, ..., K
pred_pois <- generate(fit_pois, df_pois,
  formula = ~ {
    lambda <- exp(Intercept + x)
    ppois(kk, lambda = rep(lambda, each = length(k)))
  },
  n.samples = 2000
)

# Monte Carlo estimates of r_ik = Prob(Y <= k | data) - I(y_i <= k)
results <- data.frame(
  i = rep(i, each = length(k)),
  k = kk,
  Fpred = rowMeans(pred_pois),
  residuals =
    rowMeans(pred_pois) - (rep(df_pois$y, each = length(k)) <= kk)
)

# CRPS for each observation: sum of squared residuals over k
crps_scores <- results %>%
  group_by(i) %>%
  summarise(crps = sum(residuals^2), .groups = "drop") %>%
  pull(crps)

summary(crps_scores)
```

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.393   1.726   5.307  12.659  16.446  82.424
A better option is to use posterior predictive checks.
\[ y_i|\eta_i\sim\mathcal{N}(\eta_i, \sigma^2) \]
generate() returns a matrix of samples, with one row per observation and one column per sample:

| Samples | 1 | 2 | 3 | 4 | ... | M |
|---|---|---|---|---|---|---|
| $y_1$ | 4.11 | 3.43 | 3.89 | 2.9 | ... | 2.78 |
| $y_3$ | 2.69 | 4.56 | 3.3 | 5.39 | ... | 3.61 |
| ... | ... | ... | ... | ... | ... | ... |
| $y_N$ | 2.94 | 3.78 | 1.51 | 2.29 | ... | 2.5 |
Here we compare the estimated posterior predictive densities \(\hat{\pi}^k(y|\mathbf{y})\) with the estimated data density.
This is just a simple example, but more complex checks can be computed with the same idea!
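One possible sketch of such a check for the Gaussian model above. All object names (`fit_gauss`, `df_gauss`) and the plug-in value of \(\sigma\) are illustrative assumptions, not from the original:

```r
# Posterior predictive density check (sketch; `fit_gauss`, `df_gauss`,
# and the fixed sigma are illustrative assumptions):
eta_samples <- generate(fit_gauss, df_gauss,
  formula = ~ Intercept + x,
  n.samples = 100
)
sigma <- 1  # plug-in noise sd; ideally sampled from the posterior too

# Estimated data density:
plot(density(df_gauss$y), lwd = 2, main = "Posterior predictive check")

# One simulated replicate dataset per posterior sample:
for (j in seq_len(ncol(eta_samples))) {
  y_rep <- rnorm(nrow(df_gauss), mean = eta_samples[, j], sd = sigma)
  lines(density(y_rep), col = "grey")
}
```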
This is a new option for cross-validation in inlabru: the inla.group.cv() function.
Point processes (and so LGCPs) are different from all other spatial models:
This makes model evaluation especially challenging. In particular, cross-validation-based measures are hard to define... why?
Warning ⚠️
Do not use WAIC and DIC as computed today by inlabru to compare LGCP models.
Work is ongoing, and measures of fit for LGCP models will soon be available in inlabru.
These residuals can be used to evaluate the model.
This is still work in progress and at the moment not easily available... but it will be 😀
Here you can see some examples of computation and use.
inlabru provides some easy-to-compute alternatives.